Abstract
Because of the large number of documents available on the Web, end users need to be able to access information in summary form that keeps the most important information in each document. The methods employed for automatic text summarization generally assign a score to each sentence in the document, taking into account certain features. The most relevant sentences are then selected according to those scores. In this paper, the extractive single-document summarization task is treated as a binary optimization problem and, based on the Global-best Harmony Search metaheuristic and a greedy local search procedure, a new algorithm called ESDS-GHS-GLO is proposed. This algorithm optimizes an objective function that is a normalized linear combination of the position of the sentence in the document, sentence length, and coverage of the selected sentences in the summary. The proposed method was compared with the state-of-the-art methods MA-SingleDocSum, DE, FEOM, UnifiedRank, NetSum, QCS, CRF, SVM, and Manifold Ranking, using ROUGE measures on the DUC2001 and DUC2002 datasets. The results showed that ESDS-GHS-GLO outperforms most of the state-of-the-art methods except MA-SingleDocSum. ESDS-GHS-GLO obtains promising results using a fitness function less complex than that of MA-SingleDocSum, and therefore requires less execution time.
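To make the binary formulation concrete, the following is a minimal sketch of how a candidate summary could be scored as a weighted sum of position, length, and coverage. The abstract does not give the feature formulas or the weights, so everything here is an assumption made for illustration: the names `fitness` and `cosine`, the equal default weights, whitespace tokenization, and the specific form of each feature. It is not the authors' objective function.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fitness(selection, sentences, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Score a binary selection vector (1 = sentence is in the summary).

    Hypothetical feature definitions; each term lies in [0, 1].
    """
    chosen = [i for i, bit in enumerate(selection) if bit]
    if not chosen:
        return 0.0
    n = len(sentences)
    bags = [Counter(s.lower().split()) for s in sentences]
    document = sum(bags, Counter())

    # Position (assumed form): earlier sentences score closer to 1.
    position = sum(1 - i / n for i in chosen) / len(chosen)

    # Length (assumed form): token count relative to the longest sentence.
    longest = max(sum(b.values()) for b in bags) or 1
    length = sum(sum(bags[i].values()) for i in chosen) / (len(chosen) * longest)

    # Coverage (assumed form): similarity of the summary to the whole document.
    summary = sum((bags[i] for i in chosen), Counter())
    coverage = cosine(summary, document)

    w_pos, w_len, w_cov = weights
    return w_pos * position + w_len * length + w_cov * coverage


sentences = [
    "the cat sat on the mat",
    "dogs bark",
    "the cat likes the mat",
]
print(fitness([1, 0, 0], sentences))
print(fitness([0, 1, 0], sentences))
```

A metaheuristic such as Global-best Harmony Search would search over the bit vectors passed as `selection`, subject to a summary length limit that this sketch leaves out.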
Acknowledgments
The work in this paper was supported by the University of Cauca and the National University of Colombia. We are especially grateful to Colin McLachlan for suggestions relating to English text.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mendoza, M., Cobos, C., León, E. (2015). Extractive Single-Document Summarization Based on Global-Best Harmony Search and a Greedy Local Optimizer. In: Pichardo Lagunas, O., Herrera Alcántara, O., Arroyo Figueroa, G. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2015. Lecture Notes in Computer Science, vol 9414. Springer, Cham. https://doi.org/10.1007/978-3-319-27101-9_4
DOI: https://doi.org/10.1007/978-3-319-27101-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27100-2
Online ISBN: 978-3-319-27101-9
eBook Packages: Computer Science (R0)